Interactive Visualisation and Dashboards#

import pypsa
import atlite
import pandas as pd
import geopandas as gpd
import xarray as xr
import matplotlib.pyplot as plt

plt.style.use("bmh")

from urllib.request import urlretrieve
from os.path import basename

urls = [
    "https://tubcloud.tu-berlin.de/s/2oogpgBfM5n4ssZ/download/PORTUGAL-2013-01-era5.nc",
]
for url in urls:
    urlretrieve(url, basename(url))
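
Re-running the cell above downloads the file again every time. A minimal sketch of a variant that skips files already present in the working directory could look like this; the helper name `fetch` and its `retrieve` parameter are illustrative assumptions and not part of the tutorial:

```python
from os.path import basename, exists
from urllib.request import urlretrieve


def fetch(url, retrieve=urlretrieve):
    """Download `url` into the working directory unless the file already exists.

    `fetch` and `retrieve` are hypothetical names used for illustration only.
    """
    target = basename(url)
    if not exists(target):
        retrieve(url, target)
    return target
```

With this helper, the loop above would become `for url in urls: fetch(url)`.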

Load Example Data#

First, let’s load a few example datasets you know from previous tutorials.

A PyPSA network:

n = pypsa.Network(
    "https://tubcloud.tu-berlin.de/s/kpWaraGc9LeaxLK/download/network-cem.nc"
)
/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/pypsa/components.py:323: FutureWarning:

Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '[]' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.

INFO:pypsa.io:Retrieving network data from https://tubcloud.tu-berlin.de/s/kpWaraGc9LeaxLK/download/network-cem.nc
WARNING:pypsa.io:Importing network from PyPSA version v0.21.3 while current version is v0.25.1. Read the release notes at https://pypsa.readthedocs.io/en/latest/release_notes.html to prepare your network for import.
INFO:pypsa.io:Imported network network-cem.nc has buses, carriers, generators, global_constraints, loads, storage_units
n.optimize(solver_name="highs");
/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/linopy/expressions.py:176: FutureWarning:

the `pandas.MultiIndex` object(s) passed as 'Generator' coordinate(s) or data variable(s) will no longer be implicitly promoted and wrapped into multiple indexed coordinates in the future (i.e., one coordinate for each multi-index level + one dimension coordinate). If you want to keep this behavior, you need to first wrap it explicitly using `mindex_coords = xarray.Coordinates.from_pandas_multiindex(mindex_obj, 'dim')` and pass it as coordinates, e.g., `xarray.Dataset(coords=mindex_coords)`, `dataset.assign_coords(mindex_coords)` or `dataarray.assign_coords(mindex_coords)`.

/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/linopy/expressions.py:176: FutureWarning:

the `pandas.MultiIndex` object(s) passed as 'StorageUnit' coordinate(s) or data variable(s) will no longer be implicitly promoted and wrapped into multiple indexed coordinates in the future (i.e., one coordinate for each multi-index level + one dimension coordinate). If you want to keep this behavior, you need to first wrap it explicitly using `mindex_coords = xarray.Coordinates.from_pandas_multiindex(mindex_obj, 'dim')` and pass it as coordinates, e.g., `xarray.Dataset(coords=mindex_coords)`, `dataset.assign_coords(mindex_coords)` or `dataarray.assign_coords(mindex_coords)`.

/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/pypsa/optimization/constraints.py:531: FutureWarning:

DataFrame.groupby with axis=1 is deprecated. Do `frame.T.groupby(...)` without axis instead.
INFO:linopy.model: Solve problem using Highs solver
INFO:linopy.io:Writing objective.
Writing constraints.:   0%|          | 0/15 [00:00<?, ?it/s]
Writing constraints.:  47%|████▋     | 7/15 [00:00<00:00, 60.31it/s]
Writing constraints.:  93%|█████████▎| 14/15 [00:00<00:00, 42.12it/s]
Writing constraints.: 100%|██████████| 15/15 [00:00<00:00, 45.70it/s]

Writing continuous variables.:   0%|          | 0/7 [00:00<?, ?it/s]
Writing continuous variables.: 100%|██████████| 7/7 [00:00<00:00, 124.75it/s]
INFO:linopy.io: Writing time: 0.41s
INFO:linopy.solvers:Log file at /tmp/highs.log.
Running HiGHS 1.5.3 [date: 2023-05-16, git hash: 594fa5a9d-dirty]
Copyright (c) 2023 HiGHS under MIT licence terms
Presolving model
INFO:linopy.constants: Optimization successful: 
Status: ok
Termination condition: optimal
Solution: 21906 primals, 50377 duals
Objective: 6.58e+10
Solver model: available
Solver message: optimal
INFO:pypsa.optimization.optimize:The shadow-prices of the constraints Generator-ext-p-lower, Generator-ext-p-upper, StorageUnit-ext-p_dispatch-lower, StorageUnit-ext-p_dispatch-upper, StorageUnit-ext-p_store-lower, StorageUnit-ext-p_store-upper, StorageUnit-ext-state_of_charge-lower, StorageUnit-ext-state_of_charge-upper, StorageUnit-energy_balance were not assigned to the network.
25230 rows, 18665 cols, 69120 nonzeros
25230 rows, 18665 cols, 69120 nonzeros
Presolve : Reductions: rows 25230(-25147); columns 18665(-3241); elements 69120(-32766)
Solving the presolved LP
Using EKK dual simplex solver - serial
  Iteration        Objective     Infeasibilities num(sum)
          0     0.0000000000e+00 Pr: 2190(1.32738e+09) 0s
      10626     1.7421582983e+10 Pr: 5009(1.14194e+13); Du: 0(9.94798e-08) 5s
      17807     5.0729145778e+10 Pr: 3590(3.61953e+11); Du: 0(1.42695e-07) 10s
      23199     6.5813180032e+10 Pr: 0(0); Du: 0(4.35755e-12) 12s
Solving the original LP from the solution after postsolve
Model   status      : Optimal
Simplex   iterations: 23199
Objective value     :  6.5813180032e+10
HiGHS run time      :         13.02
/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/pypsa/optimization/optimize.py:473: FutureWarning:

DataFrame.groupby with axis=1 is deprecated. Do `frame.T.groupby(...)` without axis instead.

Wind, solar and demand time series:

url = (
    "https://tubcloud.tu-berlin.de/s/nwCrNLrtL6LAN3W/download/time-series-lecture-2.csv"
)
ts = pd.read_csv(url, index_col=0, parse_dates=True)

Power plants in Europe:

url = (
    "https://raw.githubusercontent.com/PyPSA/powerplantmatching/master/powerplants.csv"
)
ppl = pd.read_csv(url, index_col=0)
geometry = gpd.points_from_xy(ppl["lon"], ppl["lat"])
ppl = gpd.GeoDataFrame(ppl, geometry=geometry, crs=4326)

NUTS2 regions:

url = "https://tubcloud.tu-berlin.de/s/RHZJrN8Dnfn26nr/download/NUTS_RG_10M_2021_4326.geojson"
nuts = gpd.read_file(url).set_index("id").query("LEVL_CODE == 2")

An atlite cutout:

cutout = atlite.Cutout("PORTUGAL-2013-01-era5.nc")
/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/xarray/core/dataset.py:270: UserWarning:

The specified chunks separate the stored chunks along dimension "time" starting at index 100. This could degrade performance. Instead, consider rechunking after loading.

Limitations of Static Plotting with Matplotlib#

Static plots made with matplotlib are great for reports, but a static image cannot be zoomed, panned or inspected by hovering.

ts["onwind [pu]"].plot(figsize=(10, 2))
[Figure: static line plot of the onshore wind capacity factor time series]

There are many Python-based interactive plotting libraries, and it can be hard to keep an overview. This tutorial introduces two of them: hvPlot and Plotly Express.

Both tools produce polished interactive figures with minimal code, at the expense of fewer customisation options.

hvPlot#

.hvplot() is a powerful and interactive Pandas-like .plot() API. You just replace .plot() with .hvplot() and you get an interactive figure. Simple as that.

It can be installed via conda or mamba in the following way:

conda install -c pyviz hvplot geoviews

Documentation can be found here: https://hvplot.holoviz.org/index.html

To use it, we have to import hvplot.pandas. This makes the .hvplot accessor available on pandas DataFrame and Series objects: after the import, df.hvplot is a valid statement, whereas before it would raise an error.

import hvplot.pandas
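
The mechanism that lets an import add a new attribute to every DataFrame is pandas' accessor registration. The sketch below illustrates that mechanism with a made-up accessor called `demo`; it is only an illustration of the general idea and not hvPlot's actual implementation:

```python
import pandas as pd


# Registering an accessor attaches a new namespace to every DataFrame.
# The accessor name "demo" and the class below are hypothetical.
@pd.api.extensions.register_dataframe_accessor("demo")
class DemoAccessor:
    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def shape_text(self):
        """Return a short description of the wrapped DataFrame."""
        rows, cols = self._obj.shape
        return f"{rows} rows x {cols} columns"


df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
print(df.demo.shape_text())  # 3 rows x 2 columns
```

Before the registration code runs, `df.demo` would raise an AttributeError, just like `df.hvplot` does before `import hvplot.pandas`.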

Let’s try it by plotting onshore wind time series for the year…

ts["onwind [pu]"].hvplot(height=200)
/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/holoviews/core/data/pandas.py:39: FutureWarning:

Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`

… or the load time series for February

ts.loc["2015-02", "load [GW]"].hvplot(height=200)

We can also plot geographic data with hvPlot, for instance, the locations of all hard coal power plants in Europe.

The geo=True argument declares that the data will be plotted in a geographic coordinate system. Once hvPlot knows that your data is in geographic coordinates, you can use the tiles keyword argument to overlay the plot on top of map tiles.

Note

For a list of available tiles, see the hvPlot documentation.

ppl.query("Fueltype == 'Hard Coal'").hvplot(
    geo=True, tiles=True, frame_height=600, frame_width=600
)

Like in geopandas, we can tell hvPlot to plot the point sizes and colors according to columns of the pandas.DataFrame. We can also change the opacity with alpha and the colormap with cmap.

plot = ppl.query("Fueltype == 'Hard Coal'").hvplot(
    geo=True,
    tiles="CartoLight",
    frame_height=600,
    c="DateIn",
    cmap="viridis",
    s="Capacity",
    alpha=0.6,
)
plot

There are a few more options of the graph that we can tweak with .opts(), such as which tools are activated by default.

plot = plot.opts(xaxis=None, yaxis=None, active_tools=["pan", "wheel_zoom"])
plot

All of this works not only with points but also with shapes. We can also pick the columns that should be shown when hovering over a shape using hover_cols.

nuts.hvplot(
    geo=True,
    tiles="OSM",
    hover_cols=["NUTS_NAME", "NUTS_ID"],
    c="CNTR_CODE",
    frame_height=500,
    alpha=0.2,
).opts(xaxis=None, yaxis=None, active_tools=["pan", "wheel_zoom"])

We can also use hvPlot for xarray datasets (e.g. atlite cutouts).

For that, we need to import the corresponding xarray accessors.

import hvplot.xarray

So let’s try it by plotting the wind speeds in Portugal as provided by ERA5. The nice thing you will notice is that it will automatically open a panel for dimensions that we did not select explicitly. In this case we can easily sweep across the time dimension. Notice also the customisation options we use here.

cutout.data.hvplot.quadmesh(
    "x",
    "y",
    "wnd100m",
    frame_height=500,
    cmap="Blues",
    geo=True,
    tiles="CartoLight",
    alpha=0.8,
    padding=0.5,
    clim=(0, 10),
)
/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/holoviews/core/util.py:1175: FutureWarning:

unique with argument that is not not a Series, Index, ExtensionArray, or np.ndarray is deprecated and will raise in a future version.

We can also plot the time series of solar generation in Germany on a heatmap:

ts.hvplot.heatmap(
    x="index.hour", y="index.month", C="solar [pu]", cmap="blues"
).aggregate(function="mean")
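
What this heatmap displays can be reproduced in plain pandas: the mean value for each combination of month and hour. The following is a minimal sketch on synthetic data, where the made-up frame `demo` stands in for the tutorial's `ts` so that the block runs on its own:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the tutorial's `ts`: one non-leap year of hourly values.
index = pd.date_range("2015-01-01", periods=365 * 24, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)
demo = pd.DataFrame({"solar [pu]": rng.random(len(index))}, index=index)

# Mean per (month, hour) pair, reshaped so months are rows and hours are columns.
table = (
    demo["solar [pu]"]
    .groupby([demo.index.month, demo.index.hour])
    .mean()
    .unstack()
)
print(table.shape)  # (12, 24)
```

Each cell of `table` corresponds to one tile of the heatmap.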

hvPlot also offers stacked area charts that come in handy for plotting the power dispatch of a solved PyPSA network:

dispatch = (
    pd.concat([n.generators_t.p, n.storage_units_t.p], axis=1).loc["2015-02"].div(1e3)
)
dispatch.where(dispatch > 0, 0).hvplot.area(
    stacked=True,
    line_width=0,
    width=1300,
    height=350,
    hover=False,
    color=[n.carriers.at[c, "color"] for c in dispatch.columns],
    ylabel="electricity supply [GW]",
    ylim=(0, 180),
)
/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/holoviews/element/chart.py:254: FutureWarning:

Creating a Groupby object with a length-1 list-like level parameter will yield indexes as tuples in a future version. To keep indexes as scalars, create Groupby objects with a scalar level parameter instead.

hvPlot also has a nice explorer that can be displayed in a Jupyter notebook and that can be used to quickly create customized plots.

hvplot.explorer(pd.DataFrame(ppl))

Plotly Express#

The plotly.express module (usually imported as px) contains functions that can create entire figures at once. Plotly Express is a built-in part of the plotly library, and is the recommended starting point for creating most common figures. Every Plotly Express function uses graph objects internally and returns a plotly.graph_objects.Figure instance. Throughout the plotly documentation, you will find the Plotly Express way of building figures at the top of any applicable page, followed by a section on how to use graph objects to build similar figures. Any figure created in a single function call with Plotly Express could be created using graph objects alone, but with between 5 and 100 times more code.

Documentation is available here: https://plotly.com/python/plotly-express/

It can be installed via conda or mamba in the following way:

conda install -c conda-forge plotly

import plotly.io as pio
import plotly.express as px
import plotly.offline as py

Note

We need to import plotly.io and plotly.offline, so that the interactive plots are also visible on the course’s static website.

Let’s reproduce the plots we previously created with hvPlot. Onshore wind capacity factor time series:

px.line(ts["onwind [pu]"])
/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/_plotly_utils/basevalidators.py:105: FutureWarning:

The behavior of DatetimeProperties.to_pydatetime is deprecated, in a future version this will return a Series containing python datetime objects instead of an ndarray. To retain the old behavior, call `np.array` on the result

Load time series in February:

px.line(ts.loc["2015-02", "load [GW]"])

Hard coal power plants in Europe:

df = ppl.query("Fueltype == 'Hard Coal'")
px.scatter_mapbox(
    df, lat="lat", lon="lon", mapbox_style="carto-positron", zoom=2, height=600
)

The same map with marker color and size taken from the DateIn and Capacity columns:

px.scatter_mapbox(
    df,
    lat="lat",
    lon="lon",
    mapbox_style="carto-positron",
    color="DateIn",
    size="Capacity",
    zoom=2,
    height=600,
)

NUTS2 regions, colored by country code:

px.choropleth_mapbox(
    nuts,
    geojson=nuts.geometry,
    locations=nuts.index,
    mapbox_style="carto-positron",
    zoom=2,
    height=600,
    color="CNTR_CODE",
    center={"lat": 48, "lon": 12},
)

The integration with xarray datasets is not as nice as in hvPlot.

px.imshow(cutout.data.wnd100m[:, :, 0].T)

But in plotly, hovering information on the area chart works much better.

dispatch = (
    pd.concat([n.generators_t.p, n.storage_units_t.p], axis=1).loc["2015-02"].div(1e3)
)
df = (
    dispatch.where(dispatch > 0, 0)
    .stack()
    .reset_index()
    .rename(columns={"level_1": "technology", 0: "GW"})
)
fig = px.area(df, x="snapshot", color="technology", y="GW", line_group="technology")
fig.update_traces(line=dict(width=0))
fig
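
The reshaping step in this cell converts the wide dispatch table (one column per technology) into the long format that px.area works with here (one row per snapshot and technology). A minimal sketch of the same chain on made-up numbers, where the frame `wide` and its values are assumptions for illustration:

```python
import pandas as pd

# Made-up wide-format dispatch in GW; negative values represent storage charging.
wide = pd.DataFrame(
    {"wind": [1.0, 2.0], "battery": [-0.5, 0.3]},
    index=pd.DatetimeIndex(["2015-02-01 00:00", "2015-02-01 01:00"], name="snapshot"),
)

long = (
    wide.where(wide > 0, 0)  # keep positive supply only, set everything else to 0
    .stack()  # one row per (snapshot, technology) pair
    .reset_index()
    .rename(columns={"level_1": "technology", 0: "GW"})
)
print(list(long.columns))  # ['snapshot', 'technology', 'GW']
```

The unnamed column level becomes `level_1` after `reset_index()`, which is why the rename targets that label.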

Interactive Dashboards#

There are many different options for building interactive dashboards. Some are brand new, some have been around for a few years.

Each of them has different characteristics, for instance in terms of customisation options and ease of use.

If you want to read a detailed comparison, the best one I found is this one:

https://www.datarevenue.com/en-blog/data-dashboarding-streamlit-vs-dash-vs-shiny-vs-voila

Just tell me which one to use

As always, “it depends” – but if you’re looking for a quick answer, you should probably use:

  • Dash if you already use Python for your analytics and you want to build production-ready data dashboards for a larger company.

  • Streamlit if you already use Python for your analytics and you want to get a prototype of your dashboard up and running as quickly as possible.

  • Shiny if you already use R for your analytics and you want to make the results more accessible to non-technical teams.

  • Jupyter if your team is very technical and doesn’t mind installing and running developer tools to view analytics.

  • Voila if you already have Jupyter Notebooks and you want to make them accessible to non-technical teams.

  • Flask if you want to build your own solution from the ground up.

  • Panel if you already have Jupyter Notebooks, and Voila is not flexible enough for your needs.

In this tutorial, we look at Streamlit because it is the quickest way to get results. However, compared to other dashboarding libraries, its configuration options are more limited.

Documentation for this package can be found here: https://docs.streamlit.io/

Streamlit can be installed, for example, with conda, mamba or pip:

conda install -c conda-forge "streamlit>=1.18"

or

pip install "streamlit>=1.18"

Note

This tutorial requires streamlit>=1.18.

This tutorial is stored on GitHub with instructions on how to install, run and deploy it:

fneum/streamlit-tutorial

You can see a live demo of the final product here:

https://ppm-dash.streamlit.app/